(CVPR 2017) Rethinking Atrous Convolution for Semantic Image Segmentation

Keyword [DeepLabv3] [Dilated Conv] [ASPP]

Chen L, Papandreou G, Schroff F, et al. Rethinking Atrous Convolution for Semantic Image Segmentation[J]. arXiv: Computer Vision and Pattern Recognition, 2017.


1. Overview


This paper proposes DeepLabv3 for semantic image segmentation.

  • Design modules that employ Atrous Conv in cascade or in parallel to capture multi-scale context by adopting multiple atrous rates (multi-grid).
  • Augment the Atrous Spatial Pyramid Pooling (ASPP) module with image-level features that encode global context.
  • Remove the DenseCRF post-processing step.
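As a reminder of what the atrous rate controls, here is a minimal 1-D sketch of atrous convolution, $y[i] = \sum_k x[i + r \cdot k] \, w[k]$, in plain Python. The function names `atrous_conv1d` and `effective_kernel_size` are illustrative only and do not come from the paper's code; no padding is applied, so only fully valid output positions are returned.

```python
def atrous_conv1d(x, w, rate):
    """1-D atrous convolution: y[i] = sum_k x[i + rate * k] * w[k].

    Only positions where every filter tap lies inside x are returned
    (no padding), so the output shrinks as the rate grows.
    """
    span = rate * (len(w) - 1)  # distance between first and last tap
    return [
        sum(x[i + rate * k] * w[k] for k in range(len(w)))
        for i in range(len(x) - span)
    ]


def effective_kernel_size(kernel_size, rate):
    """Field of view of a filter after inserting (rate - 1) holes between taps."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


signal = [1, 2, 3, 4, 5]
edge_filter = [1, 0, -1]
dense = atrous_conv1d(signal, edge_filter, rate=1)   # taps at i, i+1, i+2
dilated = atrous_conv1d(signal, edge_filter, rate=2)  # taps at i, i+2, i+4
```

With `rate=1` this is an ordinary convolution. A larger rate widens the field of view without adding weights: a 3-tap filter at rate 2 covers 5 input positions.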

1.1. Multi-scale Methods




2. Details


2.1. Cascaded Modules



2.2. Parallel Modules



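2.3. Multi-grid Method

1) Apply a different atrous rate to each of the 3 Convs within $block4$ to $block7$.
2) The final rate of each Conv is the block's unit rate times the corresponding multi-grid value. If $MultiGrid=(1,2,4)$ and the unit rate is $2$, then $rates=2 \cdot (1,2,4)=(2,4,8)$.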
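The rate arithmetic above can be sketched in a few lines of Python. The helper name `multigrid_rates` is hypothetical and chosen for illustration; it only reproduces the multiplication of the unit rate by the multi-grid tuple.

```python
def multigrid_rates(unit_rate, multi_grid=(1, 2, 4)):
    """Final atrous rate of each of the three convs in a block.

    unit_rate:  the block's base atrous rate.
    multi_grid: per-conv multipliers within the block.
    """
    return tuple(unit_rate * g for g in multi_grid)


block4_rates = multigrid_rates(2, (1, 2, 4))  # (2, 4, 8)
plain_rates = multigrid_rates(2, (1, 1, 1))   # (2, 2, 2), i.e. no multi-grid
```

Setting the multi-grid tuple to `(1, 1, 1)` gives the baseline, where all three convolutions in the block share the unit rate.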

2.4. ASPP

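1) Contains three $3 \times 3$ Dilated Convs with different rates ($(6, 12, 18)$ when the output stride is 16), one $1 \times 1$ Conv, and image-level features from Global AVGPool.
2) When the rate approaches the feature-map size, a $3 \times 3$ Dilated Conv degenerates to a $1 \times 1$ Conv, because only the center filter weight still falls on valid (non-padded) positions.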
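The degeneration in point 2 can be checked by counting, for each output position, how many of the nine taps of a $3 \times 3$ atrous filter land inside the feature map rather than in the zero padding. The sketch below assumes a square feature map and "same" zero padding; the helper names are illustrative and not taken from the paper.

```python
from collections import Counter


def valid_taps_1d(pos, size, rate):
    """Number of the 3 taps (pos - rate, pos, pos + rate) inside [0, size)."""
    return sum(1 for d in (-rate, 0, rate) if 0 <= pos + d < size)


def valid_tap_histogram(size, rate):
    """Fraction of output positions with 1, 2, 3, 4, 6, or 9 valid taps.

    A 3x3 filter is separable in terms of validity, so the 2-D count is the
    product of the per-row and per-column counts.
    """
    per_axis = [valid_taps_1d(p, size, rate) for p in range(size)]
    hist = Counter(r * c for r in per_axis for c in per_axis)
    total = size * size
    return {k: v / total for k, v in sorted(hist.items())}


# On a 65x65 feature map, the share of positions using all 9 weights shrinks
# as the rate grows, until only the center weight remains effective.
for rate in (1, 6, 12, 18, 65):
    print(rate, valid_tap_histogram(65, rate))
```

When the rate equals the feature-map size, every position has exactly one valid tap, so the filter behaves like a $1 \times 1$ Conv. This is the motivation for adding the Global AVGPool branch to capture image-level context.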




3. Experiments


3.1. Cascaded Modules

3.1.1. Output Stride



3.1.2. Deeper



3.1.3. Multi-Grid



3.1.4. Inference Strategy



3.2. Parallel Modules



3.3. Comparison